Introduction. In early 2005, Dan Goldston, Janos Pintz, and Cem Yildirim [12] made a
spectacular breakthrough in the study of prime numbers. Resolving a long-standing open 
problem, they proved that there are infinitely many primes for which the gap to the next 
prime is as small as we want compared to the average gap between consecutive primes. 
Before their work, it was only known that there were infinitely many gaps which were about 
a quarter the size of the average gap. The new result may be viewed as a step towards the 
famous twin prime conjecture that there are infinitely many prime pairs p and p + 2; the 
gap here being 2, the smallest possible gap between primes¹. Perhaps most excitingly, their
work reveals a connection between the distribution of primes in arithmetic progressions 
and small gaps between primes. Assuming certain (admittedly difficult) conjectures on the 
distribution of primes in arithmetic progressions, they are able to prove the existence of 
infinitely many prime pairs that differ by at most 16. The aim of this article is to explain 
some of the ideas involved in their work. 

Let us begin by explaining the main question in a little more detail. The number of 
primes up to x, denoted by $\pi(x)$, is roughly x/log x for large values of x; this is the
celebrated Prime Number Theorem². Therefore, if we randomly choose an integer near
x, then it has about a 1 in log x chance of being prime. In other words, as we look at
primes around size x, the average gap between consecutive primes is about log x. As
x increases, the primes get sparser, and the gap between consecutive primes tends to 
increase. Here are some natural questions about these gaps between prime numbers. Do 
the gaps always remain roughly about size log x, or do we sometimes get unexpectedly large 
gaps and sometimes surprisingly small gaps? Can we say something about the statistical 
distribution of these gaps? That is, can we quantify how often the gap is between, say, 
$\alpha \log x$ and $\beta \log x$, given $0 \le \alpha < \beta$? Except for the primes 2 and 3, clearly the gap
between consecutive primes must be even. Does every even number occur infinitely often
as a gap between consecutive primes? For example, the twin prime conjecture says that the
gap 2 occurs infinitely often. How frequently should we expect the occurrence of twin primes?
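As a rough numerical illustration of the average gap (a sketch, not part of the argument), the snippet below sieves the primes between x and 2x and compares their mean spacing with log x and log 2x. The helper names `primes_up_to` and `average_gap`, and the choice x = 10^6, are assumptions made for this example.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def average_gap(lo, hi):
    """Mean gap between consecutive primes lying in [lo, hi]."""
    ps = [p for p in primes_up_to(hi) if p >= lo]
    return (ps[-1] - ps[0]) / (len(ps) - 1)

x = 10**6
avg = average_gap(x, 2 * x)
print(f"average gap in [x, 2x]: {avg:.3f}")
print(f"log x = {math.log(x):.3f}, log 2x = {math.log(2 * x):.3f}")
```

The mean gap should land between log x and log 2x, as the prime number theorem suggests; individual gaps vary widely around that mean.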

Number theorists believe they know the answers to all these questions, but cannot
always prove that the answers are correct. Before discussing the answers let us address a
possible meta-question. Problems like twin primes and the Goldbach conjecture involve
adding and subtracting primes. The reader may well wonder if such questions are natural,
or just isolated curiosities. After all, shouldn't we be multiplying with primes rather than
adding/subtracting them? There are several possible responses to this objection.

¹ Apart from the gap between 2 and 3, of course!

² Here, and throughout, log stands for the natural logarithm.

Firstly, many number theorists and mathematical physicists are interested in understanding
spacing statistics of various sequences of numbers occurring in nature. Examples
of such sequences are prime numbers, the ordinates of zeros of the Riemann zeta-function
(see [21] and [23]), energy levels of large nuclei, the fractional parts of $\sqrt{n}$ for $n \le N$ (see
[7]), etc. Do the spacings behave like the gaps between randomly chosen numbers, or do
they follow more esoteric laws? Our questions on gaps between primes fit naturally into 
this framework. 

Secondly, many additive questions on primes have applications to other problems in 
number theory. For example, consider primes p for which 2p + 1 is also a prime. Analogously
to twin primes, it is conjectured that there are infinitely many such prime pairs p and 2p + 1.
Sophie Germain came up with these pairs in her work on Fermat's last theorem. If there
are infinitely many Germain pairs p and 2p + 1 with p lying in a prescribed arithmetic
progression, then Artin's primitive root conjecture (every positive number a which is
not a perfect square is a primitive root³ for infinitely many primes) would follow. For
example, if p lies in the progression 3 (mod 40), and 2p + 1 is prime, then 10 is a primitive
root modulo 2p + 1, and as Gauss noticed (and the reader can check) the decimal expansion
of 1/(2p + 1) has exactly 2p digits that repeat. There are also connections between additive
questions on primes and zeros of the Riemann zeta and other related functions. Precise 
knowledge of the frequency with which prime pairs p and p + 2k occur (for an even number 
2k) has subtle implications for the distribution of spacings between ordinates of zeros of 
the Riemann zeta-function (see [1] and [23]). Conversely, weird (and unlikely) patterns in 
zeros of zeta-like functions would imply the existence of infinitely many twin primes (see 
[17])! 
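The reader who wishes to check Gauss's observation by machine might proceed as in the following sketch. It lists primes p in the progression 3 (mod 40) with 2p + 1 also prime, and computes the length of the repeating block of 1/(2p + 1) as the multiplicative order of 10 modulo 2p + 1. The function names and the search limit 2000 are assumptions made for this illustration.

```python
def is_prime(n):
    """Primality test by trial division (adequate for small n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def decimal_period(q):
    """Length of the repeating block of 1/q for q coprime to 10,
    i.e. the multiplicative order of 10 modulo q."""
    k, r = 1, 10 % q
    while r != 1:
        r = (r * 10) % q
        k += 1
    return k

def germain_pairs_3_mod_40(limit):
    """Primes p < limit with p = 3 (mod 40) and 2p + 1 prime."""
    return [p for p in range(3, limit, 40) if is_prime(p) and is_prime(2 * p + 1)]

for p in germain_pairs_3_mod_40(2000):
    q = 2 * p + 1
    print(f"p = {p}, 2p + 1 = {q}, period of 1/{q} = {decimal_period(q)} (2p = {2 * p})")
```

In each printed row the period should equal 2p, which is the statement that 10 is a primitive root modulo 2p + 1.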

Finally, these 'additive' questions on primes are lots of fun, have led to much beautiful 
mathematics, and inspired many generations of number theorists! 

Cramer's model. A useful way to think about statistical questions on prime numbers is
the random model, also known as the Cramer model. The principle, based on the fact that a
number of size about n has a 1 in log n chance of being prime, is this:

The indicator function for the set of primes (that is, the function whose value at n
is 1 or 0 depending on whether n is prime or not) behaves roughly like a sequence of
independent Bernoulli random variables X(n) with parameters $1/\log n$ ($n \ge 3$). In other
words, for $n \ge 3$, the random variable X(n) takes the value 1 (n is 'prime') with probability
$1/\log n$, and X(n) takes the value 0 (n is 'composite') with probability $1 - 1/\log n$. For
completeness, let us set X(1) = 0 and X(2) = 1.

This must be taken with a liberal dose of salt: a number is either prime or composite;
probability does not enter the picture! Nevertheless, the Cramer model is very effective
in predicting answers, although it does have its limitations (for example, if n > 2 is prime
then certainly n + 1 is not, so the events of n and n + 1 being prime are clearly not
independent) and sometimes leads to incorrect predictions.

Let us use the Cramer model to predict the probability that, given a large prime p, the
next prime lies somewhere between $p + \alpha \log p$ and $p + \beta \log p$. In the Cramer model, let p
be large and suppose that X(p) = 1. What is the probability that X(p + 1) = X(p + 2) =
... = X(p + h − 1) = 0 and X(p + h) = 1, for some integer h in the interval $[\alpha \log p, \beta \log p]$?
We will find this by calculating the desired probability for a given h in that interval, and
summing that answer over all such h. For a given h the probability we seek is

$$\left(1 - \frac{1}{\log(p+1)}\right)\left(1 - \frac{1}{\log(p+2)}\right) \cdots \left(1 - \frac{1}{\log(p+h-1)}\right) \frac{1}{\log(p+h)}.$$

Since p is large, and h is small compared to p (it's only of size about log p), we estimate
that log(p + j) is very nearly log p for j between 1 and h. Therefore our probability above
is approximately $(1 - 1/\log p)^{h-1} (1/\log p)$, and since $1 - 1/\log p$ is about $e^{-1/\log p}$, this is
roughly

$$e^{-(h-1)/\log p} \left(\frac{1}{\log p}\right).$$

Summing over the appropriate h, we find that the random model prediction for the probability
that the next prime larger than p lies in $[p + \alpha \log p, p + \beta \log p]$ is

$$\sum_{\alpha \log p \le h \le \beta \log p} e^{-(h-1)/\log p} \frac{1}{\log p} \approx \int_\alpha^\beta e^{-t}\,dt,$$

since the left hand side looks like a Riemann sum approximation to the integral.

Conjecture 1. Given an interval $0 \le \alpha < \beta$, as $x \to \infty$ we have

$$\frac{1}{\pi(x)} \#\{p \le x : p_{\text{next}} \in (p + \alpha \log p, p + \beta \log p)\} \to \int_\alpha^\beta e^{-t}\,dt,$$

where $p_{\text{next}}$ denotes the next prime larger than p. Here, and throughout the paper, the
letter p is reserved for primes.

We have deliberately left the integral unevaluated, to suggest that there is a probability
density $e^{-t}$ of finding $(p_{\text{next}} - p)/\log p$ close to t. If we pick N random numbers uniformly
and independently from the interval [0, N], and arrange them in ascending order, then,
almost surely, the consecutive spacings have the probability density $e^{-t}$. Thus, the Cramer
model indicates that the gaps between consecutive primes are distributed like the gaps 
between about x / log x numbers chosen uniformly and independently from the interval 
[0,x]. In probability terminology, this is an example of what is known as a 'Poisson 
process.' 
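Conjecture 1 can be tested against actual primes. The sketch below computes, for primes p up to x, the fraction whose normalized gap $(p_{\text{next}} - p)/\log p$ lies strictly between $\alpha$ and $\beta$, and prints it next to the predicted value $e^{-\alpha} - e^{-\beta}$. The function names, the value x = 10^6, and the sample intervals are assumptions chosen for illustration; agreement at this height is only approximate.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def gap_fraction(x, alpha, beta):
    """Fraction of primes p <= x whose next prime lies strictly
    between p + alpha*log p and p + beta*log p."""
    # Sieving up to 2x guarantees that every p <= x has its successor in the list.
    ps = primes_up_to(2 * x)
    hits = total = 0
    for p, q in zip(ps, ps[1:]):
        if p > x:
            break
        total += 1
        if alpha * math.log(p) < q - p < beta * math.log(p):
            hits += 1
    return hits / total

def predicted(alpha, beta):
    """Integral of e^{-t} from alpha to beta."""
    return math.exp(-alpha) - math.exp(-beta)

for alpha, beta in ((0.0, 1.0), (1.0, 2.0)):
    print(alpha, beta, round(gap_fraction(10**6, alpha, beta), 4), round(predicted(alpha, beta), 4))
```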

There are several related predictions we could make using the random model. For 
example, choose a random number n below x, and consider the interval [n, n + log n]. The 
expected number of primes in such an interval is about 1, by the prime number theorem. 
But of course some intervals may contain no prime at all while others may contain several
primes. Given a non-negative integer k, what is the probability that such an interval
contains exactly k primes? The reader may enjoy the pleasant calculation which predicts
that, for large x, the answer is nearly $\frac{1^k}{k!} e^{-1}$; the answer is written so as to suggest a
Poisson distribution with parameter 1.

Conjecture 1 makes clear that there is substantial variation in the gaps between consecutive
primes. Given any large number A we expect that with probability about $e^{-A}$ (a
tiny, but positive probability), the gap between consecutive primes is more than A times
the average gap. Given any small positive number $\epsilon$ we expect that with probability about
$1 - e^{-\epsilon}$ (a small, but positive probability), the gap between consecutive primes is at most
$\epsilon$ times the usual gap. Thus, two consequences of Conjecture 1 are

$$\limsup_{p \to \infty} \frac{p_{\text{next}} - p}{\log p} = \infty,$$

and

$$\liminf_{p \to \infty} \frac{p_{\text{next}} - p}{\log p} = 0.$$



Large gaps. Everyone knows how to construct arbitrarily long intervals of composite
numbers: just look at m! + 2, m! + 3, ..., m! + m for any natural number $m \ge 2$. This
shows that $\limsup_{p \to \infty} (p_{\text{next}} - p) = \infty$. However, if we think of m! being of size about x
then a little calculation with Stirling's formula shows that m is about size (log x)/log log x.
We realize, with dismay, that the 'long' gap we have constructed is not even as large as
the average gap of log x given by the prime number theorem. A better strategy is to take
N to be the product of the primes that are at most m, and note again that N + 2, ...,
N + m must all be composite. It can be shown that N is roughly of size $e^m$. Thus we have
found a gap at least about log N, which is better than before, but still not better than
average. Can we modify the argument a little? In creating our string of m − 1 consecutive
composite numbers, we forced these numbers to be divisible by some prime below m. Can
we somehow use primes larger than m to force N + m + 1, N + m + 2, etc., to be composite,
and thus create longer chains of composite numbers? In the 1930s, in a series of papers
Westzynthius [27], Erdos [8] and Rankin [25] found ingenious ways of making this idea
work. The best estimate was obtained by Rankin, who proved that there exists a positive
constant c such that for infinitely many primes p,

$$p_{\text{next}} - p \ge c \log p \, \frac{(\log\log p) \log\log\log\log p}{(\log\log\log p)^2}.$$

The fraction above does grow⁴, and so

$$\limsup_{p \to \infty} \frac{p_{\text{next}} - p}{\log p} = \infty,$$

as desired. We should remark here that, although very interesting work has been done on
improving the constant c above, Rankin's result provides the largest known gap between
primes. Erdos offered $10,000 for a similar conclusion involving a faster growing function.
Bounty hunters may note that the largest Erdos prize that has been collected is $1,000, by
Szemeredi [26] for his marvellous result on the existence of long arithmetic progressions in
sets of positive density.
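The two elementary constructions at the start of this section (taking N = m!, or N equal to the product of the primes up to m) are easy to try out. The following sketch checks that N + 2, ..., N + m are all composite and prints log N for comparison with the run length m − 1. The helper names and the sample values of m are assumptions for this illustration.

```python
import math

def is_prime(n):
    """Primality test by trial division (adequate for the sizes used here)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def primorial(m):
    """Product of all primes that are at most m."""
    return math.prod(p for p in range(2, m + 1) if is_prime(p))

def all_composite(N, m):
    """True if N + 2, N + 3, ..., N + m are all composite."""
    return all(not is_prime(N + j) for j in range(2, m + 1))

for m in (6, 8, 10, 12):
    for name, N in (("m!", math.factorial(m)), ("primorial", primorial(m))):
        print(f"m = {m:2d}  N = {name:9s} = {N:>10d}  "
              f"{m - 1} composites in a row: {all_composite(N, m)}  log N = {math.log(N):.2f}")
```

Each N + j with $2 \le j \le m$ shares a prime factor with N, which is why both checks succeed; the primorial achieves the same run with a much smaller N.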

What should we conjecture for the longest gap between primes? Cramer's model suggests
that

$$\limsup_{p \to \infty} \frac{p_{\text{next}} - p}{(\log p)^2} = c, \tag{1}$$

with c = 1. The rationale behind this is that the probability that X(n) = 1 and that the
next 'prime' is bigger than $n + (1 + \epsilon)(\log n)^2$ is about $1/(n^{1+\epsilon} \log n)$, by a calculation similar
to the one leading up to Conjecture 1. If $\epsilon$ is negative the sum of this probability over all
n diverges, and the Borel-Cantelli lemma tells us that, almost surely, such long gaps occur
infinitely often. If $\epsilon$ is positive, the corresponding sum converges and the Borel-Cantelli
lemma says that almost surely we get these longer gaps only a finite number of times.
More sophisticated analysis has however revealed that (1) is one of those questions which
expose the limitations of the Cramer model. It appears unlikely that the value of c is 1 as
predicted by the Cramer model; rather, c should be at least $2e^{-\gamma} \approx 1.1229$, where $\gamma$ is
Euler's constant. No one has felt brave enough to suggest what the precise value of c should
be! This is because (1) is far beyond what 'reasonable' conjectures such as the Riemann
hypothesis would imply. An old conjecture says that there is always a prime between two
consecutive squares. Even this lies (slightly) beyond the reach of the Riemann hypothesis,
and all it would imply is that

$$\limsup_{p \to \infty} \frac{p_{\text{next}} - p}{\sqrt{p}} \le 4;$$

a statement much weaker than (1) with a finite value of c.

We cut short our discussion on long gaps here, since our focus will be on small gaps; 
for more information on these and related problems, we refer the reader to the excellent 
survey articles by Heath-Brown [18] and Granville [15]. 

Small gaps. Since the average spacing between p and $p_{\text{next}}$ is about log p, clearly

$$\liminf_{p \to \infty} \frac{p_{\text{next}} - p}{\log p} \le 1.$$

Erdos [9] was the first to show that the liminf is strictly less than 1. Other landmark
results in the area are the works of Bombieri and Davenport [3], Huxley [20], and Maier
[22], who introduced several new ideas to this study and progressively reduced the liminf
to $\le 0.24\ldots$. Enter Goldston, Pintz, and Yildirim:

Theorem 1. We have

$$\liminf_{p \to \infty} \frac{p_{\text{next}} - p}{\log p} = 0.$$

So there are substantially smaller gaps between primes than the average! What about
even smaller gaps? Can we show that $\liminf_{p \to \infty} (p_{\text{next}} - p) < \infty$ (bounded gaps), or
perhaps even $\liminf_{p \to \infty} (p_{\text{next}} - p) = 2$ (twin primes!)?






K. SOUNDARARAJAN 



Theorem 2. Suppose the Elliott-Halberstam conjecture on the distribution of primes in
arithmetic progressions holds true. Then

$$\liminf_{p \to \infty} (p_{\text{next}} - p) \le 16.$$

What is the Elliott-Halberstam conjecture? One valuable thing that we know about 
primes is their distribution in arithmetic progressions. Knowledge of this, in the form of the 
Bombieri-Vinogradov theorem, plays a crucial role in the proof of Theorem 1. To obtain
the stronger conclusion of Theorem 2, one needs a better understanding of the distribution 
of primes in progressions and the Elliott-Halberstam conjecture provides the necessary 
stronger input. Vaguely, the Goldston-Pintz-Yildirim results say that if the primes are 
well separated with no small gaps between them, then something weird must happen to 
their distribution in progressions. 

Given a progression a (mod q) let $\pi(x; q, a)$ denote the number of primes below x lying
in this progression. Naturally we may suppose that a and q are coprime, else there is at
most one prime in the progression. Now there are $\phi(q)$ (this is Euler's $\phi$-function)
such progressions a (mod q) with a coprime to q. We would expect that each progression
captures its fair share of primes. In other words we expect that $\pi(x; q, a)$ is roughly
$\pi(x)/\phi(q)$. The prime number theorem in arithmetic progressions tells us that this is true
if we view q as being fixed and let x go to infinity.
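The 'fair share' heuristic is easy to observe numerically for a fixed small modulus. The sketch below counts the primes up to x in each residue class a (mod q) with a coprime to q and compares the counts with $\pi(x)/\phi(q)$. The function names and the choices x = 10^6, q = 10 are assumptions for this illustration.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def progression_counts(x, q):
    """Map each residue a coprime to q to the number of primes p <= x with p = a (mod q)."""
    counts = {a: 0 for a in range(q) if math.gcd(a, q) == 1}
    for p in primes_up_to(x):
        if p % q in counts:
            counts[p % q] += 1
    return counts

x, q = 10**6, 10
counts = progression_counts(x, q)
fair_share = len(primes_up_to(x)) / len(counts)  # pi(x) / phi(q)
for a, c in sorted(counts.items()):
    print(f"pi({x}; {q}, {a}) = {c}   (pi(x)/phi(q) = {fair_share:.1f})")
```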

In applications, such as Theorem 1, we need information on $\pi(x; q, a)$ when q is not
fixed, but growing with x. When q is growing slowly, say q is like log x, the prime number
theorem in arithmetic progressions still applies. However if q is a little larger, say q is of size
$x^{1/3}$, then currently we cannot prove the equidistribution of primes in the available residue
classes (mod q). Such a result would be implied by the Generalized Riemann Hypothesis
(indeed for q up to about $\sqrt{x}$), but of course the Generalized Riemann Hypothesis remains
unresolved. In this context, Bombieri and Vinogradov showed that the equidistribution of
primes in progressions holds, not for each individual q, but on average over q (that is, for a
typical q) for q going up to about $\sqrt{x}$. Their result may be thought of as the 'Generalized
Riemann Hypothesis on average.'

The Elliott-Halberstam conjecture says that the equidistribution of primes in progressions
continues to hold on average for q going up to $x^{1-\epsilon}$ for any given positive $\epsilon$. In
some ways, this lies deeper than the Generalized Riemann Hypothesis which permits only
$q \le \sqrt{x}$.

We hope that the reader has formed a rough impression of the nature of the assumption 
in Theorem 2. We will state the Bombieri-Vinogradov theorem and Elliott-Halberstam
conjecture precisely in the penultimate section devoted to primes in progressions. 

The Hardy-Littlewood conjectures. We already noticed a faulty feature of the Cramer
model: given a large prime p, the probability that p + 1 is prime is not 1/log(p + 1) but
0 because p + 1 is even. Neither would we expect the conditional probability of p + 2 being
prime to be simply 1/log(p + 2): after all, p + 2 is guaranteed to be odd and this should
give it a better chance of being prime. How should we formulate the correct probability
for p + 2 being prime? More precisely, what should be the conjectural asymptotics for

$$\#\{p \le x : p + 2 \text{ prime}\}?$$

The Cramer model would have predicted that this is about $x/(\log x)^2$. While we must
definitely modify this, it also seems reasonable that $x/(\log x)^2$ is the right size for the
answer. So maybe the answer is about $c\,x/(\log x)^2$ for an appropriate constant c.

Long ago Hardy and Littlewood [16] figured out what the right conjecture should be. 
The problem with the Cramer model is that it treats n and n + 2 as being independent, 
whereas they are clearly dependent. If we want n and n + 2 both to be prime, then they 
must both be odd, neither of them must be divisible by 3, nor by 5, and so on. If we 
choose n randomly, the probability that n and n + 2 are both odd is 1/2. In contrast, 
two randomly chosen numbers would both be odd with a 1/4 probability. If neither n nor 
n + 2 is divisible by 3 then n must be 2 (mod 3), which has a 1/3 probability. On the 
other hand, the probability that two randomly chosen numbers are not divisible by 3 is 
(2/3) · (2/3) = 4/9. Similarly, for any prime $\ell \ge 3$, the probability that n and n + 2 are not
divisible by $\ell$ is $1 - 2/\ell$, which is a little different from the probability $(1 - 1/\ell)^2$ that two
randomly chosen integers are both not divisible by $\ell$. For the prime 2 we must correct the
probability 1/4 by multiplying by $2 = (1 - 1/2)(1 - 1/2)^{-2}$, and for all primes $\ell \ge 3$ we
must correct the probability $(1 - 1/\ell)^2$ by multiplying by $(1 - 2/\ell)(1 - 1/\ell)^{-2}$. The idea
is that if we multiply all these correction factors together then we have accounted for 'all
the ways' in which n and n + 2 are dependent, producing the required correction constant
c. Thus the conjectured value for c is the product over primes

$$\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{2}\right)^{-2} \prod_{\ell \ge 3} \left(1 - \frac{2}{\ell}\right)\left(1 - \frac{1}{\ell}\right)^{-2}.$$

Let us make a synthesis of the argument above, which will allow us to generalize it. For
any prime $\ell$ let $\nu_{\{0,2\}}(\ell)$ denote the number of distinct residue classes (mod $\ell$) occupied
by the numbers 0 and 2. If we want n and n + 2 to be both coprime to $\ell$ then
n must avoid the residue classes occupied by −0 and −2 (mod $\ell$), so that n must lie in
one of $\ell - \nu_{\{0,2\}}(\ell)$ residue classes. The probability that this happens is $1 - \nu_{\{0,2\}}(\ell)/\ell$, so
the correction factor for $\ell$ is $(1 - \nu_{\{0,2\}}(\ell)/\ell)(1 - 1/\ell)^{-2}$. As before, consider the infinite
product over primes

$$\mathfrak{S}(\{0,2\}) := \prod_{\ell} \left(1 - \frac{\nu_{\{0,2\}}(\ell)}{\ell}\right)\left(1 - \frac{1}{\ell}\right)^{-2}.$$

The infinite product certainly converges: the terms for $\ell \ge 3$ are all less than 1 in size.
Moreover, it converges to a non-zero number. Note that none of the factors above is zero,
and that for large $\ell$ the logarithm of the corresponding factor above is very small; it is
$\log(1 - 1/(\ell - 1)^2) \sim -1/\ell^2$. Thus the sum of the logarithms converges, and the product
is non-zero; indeed $\mathfrak{S}(\{0,2\})$ is numerically about 1.3203. Then the conjecture is that for
large x

$$\#\{p \le x : p + 2 \text{ prime}\} \sim \mathfrak{S}(\{0,2\}) \frac{x}{(\log x)^2}.$$

Here and below, the notation $f(x) \sim g(x)$ means that $\lim_{x \to \infty} f(x)/g(x) = 1$.
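The numerical value of the constant, and the quality of the prediction at modest heights, can be examined with a short computation. The sketch below truncates the product over primes at a finite limit (an assumption; the true constant is the infinite product) and counts twin primes up to x directly. The function names and the cutoffs 10^5 and 10^6 are chosen for illustration only.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def twin_constant(limit):
    """Truncated product: 2 * prod over primes 3 <= l <= limit of (1 - 2/l)(1 - 1/l)^(-2)."""
    c = 2.0
    for l in primes_up_to(limit):
        if l >= 3:
            c *= (1 - 2 / l) / (1 - 1 / l) ** 2
    return c

def twin_count(x):
    """Number of primes p <= x such that p + 2 is also prime."""
    ps = primes_up_to(x + 2)
    prime_set = set(ps)
    return sum(1 for p in ps if p <= x and p + 2 in prime_set)

x = 10**6
c = twin_constant(10**5)
print(f"truncated constant: {c:.5f}")
print(f"twin primes up to {x}: {twin_count(x)}")
print(f"c * x / (log x)^2 = {c * x / math.log(x) ** 2:.0f}")
```

The count and the prediction agree in order of magnitude; the conjecture is an asymptotic statement, and the ratio approaches 1 only slowly.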

The conjecture generalizes readily: Suppose we are given a set $\mathcal{H} = \{h_1, h_2, \ldots, h_k\}$ of
non-negative integers and we want to find the frequency with which $n + h_1, \ldots, n + h_k$
are all prime. For a prime number $\ell$, we define $\nu_{\mathcal{H}}(\ell)$ to be the number of distinct residue
classes (mod $\ell$) occupied by $\mathcal{H}$. We define the 'singular series'⁵

$$\mathfrak{S}(\mathcal{H}) = \prod_{\ell} \left(1 - \frac{\nu_{\mathcal{H}}(\ell)}{\ell}\right)\left(1 - \frac{1}{\ell}\right)^{-k}. \tag{2}$$

If $\ell$ is larger than all elements of $\mathcal{H}$ then $\nu_{\mathcal{H}}(\ell) = k$, and for such $\ell$ the terms in the product
are less than 1. Thus the product converges. When does it converge to a non-zero number?
If $\nu_{\mathcal{H}}(\ell) = \ell$ for some prime $\ell$ then one of the terms in our product vanishes, and so our
product must be zero. Suppose none of the terms is zero. For large $\ell$ the logarithm of the
corresponding factor is

$$\log\left(\left(1 - \frac{k}{\ell}\right)\left(1 - \frac{1}{\ell}\right)^{-k}\right) \sim -\frac{k(k-1)}{2\ell^2},$$

and so the sum of the logarithms converges, and our product is non-zero. Thus the singular
series is zero if and only if $\nu_{\mathcal{H}}(\ell) = \ell$ for some prime $\ell$; that is, if and only if the numbers
$h_1, \ldots, h_k$ occupy all the residue classes (mod $\ell$) for some prime $\ell$. In that case, for any
n one of the numbers $n + h_1, \ldots, n + h_k$ must be a multiple of $\ell$, and so there are only
finitely many prime k-tuples $n + h_1, \ldots, n + h_k$.

The Hardy-Littlewood conjecture. Let $\mathcal{H} = \{h_1, \ldots, h_k\}$ be a set of positive integers
such that $\mathfrak{S}(\mathcal{H}) \ne 0$. Then

$$\#\{n \le x : n + h_1, \ldots, n + h_k \text{ prime}\} \sim \mathfrak{S}(\mathcal{H}) \frac{x}{(\log x)^k}.$$



It is easy to see that $\mathfrak{S}(\{0, 2r\}) \ne 0$ for every non-zero even number 2r. Thus the Hardy-Littlewood
conjecture predicts that there are about $\mathfrak{S}(\{0, 2r\})\, x/(\log x)^2$ prime pairs p and
p + 2r with p below x. Further, the number of these pairs for which p + 2d is prime for some
d between 1 and r − 1 is at most a constant times $x/(\log x)^3$. We deduce that there should
be infinitely many primes p for which the gap to the next prime is exactly 2r. Thus every
positive even number should occur infinitely often as a gap between successive primes, but
we don't know this for a single even number!

For any k, it is easy to find k-element sets $\mathcal{H}$ with $\mathfrak{S}(\mathcal{H}) \ne 0$. For example, take $\mathcal{H}$ to
be any k primes all larger than k. Clearly if $\ell > k$ then $\nu_{\mathcal{H}}(\ell) \le k < \ell$, while if $\ell \le k$ then
the residue class 0 (mod $\ell$) must be omitted by the elements of $\mathcal{H}$ (they are primes!) and
so once again $\nu_{\mathcal{H}}(\ell) < \ell$.
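The quantities $\nu_{\mathcal{H}}(\ell)$ and the non-vanishing criterion for the singular series are simple to compute. The sketch below evaluates $\nu_{\mathcal{H}}(\ell)$, tests whether the singular series is non-zero (only primes $\ell \le k$ need to be examined, since $\nu_{\mathcal{H}}(\ell) \le k$), and approximates the product in (2) by truncating it at a finite limit. The function names, the truncation limit, and the sample sets are assumptions made for this illustration.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def nu(H, l):
    """Number of distinct residue classes (mod l) occupied by the elements of H."""
    return len({h % l for h in H})

def has_nonzero_singular_series(H):
    """True if no prime l has all l residue classes occupied by H.
    Since nu(H, l) <= len(H), only primes l <= len(H) can fail."""
    return all(nu(H, l) < l for l in primes_up_to(len(H)))

def singular_series(H, limit=10**5):
    """Product (2) truncated to primes l <= limit (an approximation)."""
    k = len(H)
    value = 1.0
    for l in primes_up_to(limit):
        value *= (1 - nu(H, l) / l) / (1 - 1 / l) ** k
    return value

for H in ([0, 2], [0, 2, 4], [0, 2, 6], [7, 11, 13, 17, 19, 23]):
    print(H, has_nonzero_singular_series(H), round(singular_series(H), 4))
```

The set {0, 2, 4} occupies every residue class modulo 3, so its singular series vanishes, while the last sample set consists of six primes all larger than 6, as in the example above.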

We make one final comment before turning (at last!) to the ideas behind the proofs
of Theorems 1 and 2. Conjecture 1 was made on the strength of the Cramer model,
but we have just been discussing how to modify the Cramer probabilities for prime k-tuples.
A natural question is whether the Hardy-Littlewood conjectures are consistent
with Conjecture 1. In a beautiful calculation [11], Gallagher showed that Conjecture 1 can
in fact be obtained starting from the Hardy-Littlewood conjectures. The crucial point in
his proof is that although $\mathfrak{S}(\mathcal{H})$ is not always 1 (as the Cramer model would have), it is
approximately 1 on average over all k-element sets $\mathcal{H}$ with the $h_j \le h$. That is, as $h \to \infty$,

$$\sum_{1 \le h_1 < h_2 < \cdots < h_k \le h} \mathfrak{S}(\{h_1, \ldots, h_k\}) \sim \sum_{1 \le h_1 < h_2 < \cdots < h_k \le h} 1. \tag{3}$$

⁵ The terminology is not entirely whimsical: Hardy and Littlewood originally arrived at their conjecture
through a heuristic application of their 'circle method.' In their derivation, $\mathfrak{S}(\mathcal{H})$ did arise as a series rather
than as our product.

The ideas of Goldston, Pintz and Yildirim. We will start with the idea behind
Theorem 2. Let k be a given positive integer which is at least 2. Let $\mathcal{H} = \{h_1 < \ldots < h_k\}$
be a set with $\mathfrak{S}(\mathcal{H}) \ne 0$. We aspire to the Hardy-Littlewood conjecture which says that
there must be infinitely many n such that $n + h_1, \ldots, n + h_k$ are all prime. Since there
are infinitely many primes, trivially at least one of the numbers $n + h_1, \ldots, n + h_k$ is
prime infinitely often. Can we do a little better: can we show that two of the numbers
$n + h_1, \ldots, n + h_k$ are prime infinitely often? If we could, then we would plainly have that
$\liminf_{p \to \infty} (p_{\text{next}} - p) \le (h_k - h_1)$.

How do we detect two primes in $n + h_1, \ldots, n + h_k$? Let x be large and consider n
varying between x and 2x. Suppose we are able to find a function a(n) which is always
non-negative, and such that, for each j = 1, ..., k,

$$\sum_{\substack{x \le n \le 2x \\ n + h_j \text{ prime}}} a(n) > \frac{1}{k} \sum_{x \le n \le 2x} a(n). \tag{4}$$

Then summing over j = 1, ..., k, it would follow that

$$\sum_{x \le n \le 2x} \#\{1 \le j \le k : n + h_j \text{ prime}\}\, a(n) > \sum_{x \le n \le 2x} a(n),$$

so that for some number n lying between x and 2x we must have at least two primes among
$n + h_1, \ldots, n + h_k$.

Of course, the question is how do we find such a function a(n) satisfying (4)? We would
like to take a(n) = 1 if $n + h_1, \ldots, n + h_k$ are all prime, and 0 otherwise. But then the
problem of evaluating $\sum_{x \le n \le 2x} a(n)$ is precisely that of establishing the Hardy-Littlewood
conjecture.

The answer is suggested by sieve theory, especially the theory of Selberg's sieve. Sieve 
theory is concerned with finding primes, or numbers without too many prime factors, 
among various integer sequences. Some of the spectacular achievements of this theory are 
Chen's theorem [5] that for infinitely many primes p, the number p + 2 has at most two 
prime factors; the result of Friedlander and Iwaniec [10] that there are infinitely many 
primes of the form $x^2 + y^4$ where x and y are integers; and the result of Heath-Brown [19]
that there are infinitely many primes of the form $x^3 + 2y^3$ where x and y are integers. We
recall here very briefly the idea behind Selberg's sieve. 

Interlude on Selberg's sieve. We illustrate Selberg's sieve by giving an upper bound
on the number of prime k-tuples $n + h_1, \ldots, n + h_k$ with $x \le n \le 2x$. The idea is to find a
'nice' function a(n) which equals 1 if $n + h_1, \ldots, n + h_k$ are all prime, and is non-negative
otherwise. Then $\sum_{x \le n \le 2x} a(n)$ provides an upper bound for the number of prime k-tuples.
Of course, we must choose a(n) appropriately, so as to be able to evaluate $\sum_{x \le n \le 2x} a(n)$.
Selberg's choice for a(n) is as follows: Let $\lambda_d$ be a sequence of real numbers such that

$$\lambda_1 = 1, \quad \text{and with } \lambda_d = 0 \text{ for } d > R. \tag{5}$$

Choose⁶

$$a(n) = \Big( \sum_{d \mid (n+h_1)\cdots(n+h_k)} \lambda_d \Big)^2. \tag{6}$$

Being a square, a(n) is clearly non-negative. If $R < x \le n$ and $n + h_1, \ldots, n + h_k$ are
all prime, then the only non-zero term in (6) is for d = 1 and so a(n) = 1 as desired.
Therefore we assume that R < x below. The goal is to choose $\lambda_d$ so as to minimize
$\sum_{x \le n \le 2x} a(n)$. There is an advantage to allowing R as large as possible, since this gives
us greater flexibility in choosing the parameters $\lambda_d$. On the other hand it is easier to
estimate $\sum_{x \le n \le 2x} a(n)$ when R is small since there are fewer divisors d to consider. In
the problem at hand, it turns out that we can choose R roughly of size $\sqrt{x}$. This choice
leads to an upper bound for the number of prime k-tuples of about $2^k k!\, \mathfrak{S}(\mathcal{H})\, x/(\log x)^k$.
That is, a bound about $2^k k!$ times the conjectured Hardy-Littlewood asymptotic.
Expanding out the square in (6) and summing over n, we must evaluate

$$\sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{x \le n \le 2x \\ d_1 \mid (n+h_1)\cdots(n+h_k) \\ d_2 \mid (n+h_1)\cdots(n+h_k)}} 1 = \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{x \le n \le 2x \\ [d_1, d_2] \mid (n+h_1)\cdots(n+h_k)}} 1,$$

where $[d_1, d_2]$ denotes the l.c.m. of $d_1$ and $d_2$. The condition $[d_1, d_2] \mid (n + h_1) \cdots (n + h_k)$
means that n must lie in a certain number (say, $f([d_1, d_2])$) of residue classes (mod $[d_1, d_2]$).
Can we count the number of $x \le n \le 2x$ lying in the union of these arithmetic progressions?
Divide the interval [x, 2x] into intervals of length $[d_1, d_2]$ with possibly one smaller
interval left over at the end. Each complete interval (and there are about $x/[d_1, d_2]$ of
these) gives $f([d_1, d_2])$ values of n; the last shorter interval contributes an indeterminate
'error' between 0 and $f([d_1, d_2])$. So, at least if $[d_1, d_2]$ is a bit smaller than x, we can
estimate the sum over n accurately. Since $[d_1, d_2] \le d_1 d_2 \le R^2$, if R is a bit smaller than⁷
$\sqrt{x}$, then the sum over n can be evaluated accurately. Let us suppose that R is about size
$\sqrt{x}$ and that the error terms can be disposed of satisfactorily. It remains to handle the
main term contribution to $\sum_{x \le n \le 2x} a(n)$, namely

$$x \sum_{d_1, d_2 \le R} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} f([d_1, d_2]). \tag{7}$$

⁶ Below, the symbol a|b means that a divides b.
⁷ To be precise, R must be $\le \sqrt{x}/(\log x)^{2k}$, say.

The reader may wonder what $f([d_1, d_2])$ is. Let us work this out in the case when $[d_1, d_2]$ is
not divisible by the square of any prime; the other case is more complicated, but not very
important in this problem. If p is a prime and we want $p \mid (n + h_1) \cdots (n + h_k)$ then clearly
$n \equiv -h_j$ (mod p) for some j, so that n lies in one of $\nu_{\mathcal{H}}(p)$ residue classes (mod p). By
the Chinese remainder theorem it follows that if $[d_1, d_2] \mid (n + h_1) \cdots (n + h_k)$ then n lies in
$\prod_{p \mid [d_1, d_2]} \nu_{\mathcal{H}}(p)$ residue classes (mod $[d_1, d_2]$). Thus f is a multiplicative function⁸, with
$f(p) = \nu_{\mathcal{H}}(p)$.
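The multiplicativity of f on squarefree moduli can be confirmed by brute force for small examples. The sketch below counts the residue classes n (mod d) with d dividing $(n + h_1) \cdots (n + h_k)$ directly, and compares the count with the product of $\nu_{\mathcal{H}}(p)$ over the primes p dividing d. The function names and the sample set {0, 2, 6} are assumptions for this illustration.

```python
import math

def prime_factors(d):
    """Distinct prime factors of d, by trial division."""
    factors, p = [], 2
    while p * p <= d:
        if d % p == 0:
            factors.append(p)
            while d % p == 0:
                d //= p
        p += 1
    if d > 1:
        factors.append(d)
    return factors

def nu(H, p):
    """Number of distinct residue classes (mod p) occupied by H."""
    return len({h % p for h in H})

def f_by_crt(H, d):
    """Product of nu_H(p) over primes p | d; valid for squarefree d."""
    return math.prod(nu(H, p) for p in prime_factors(d))

def f_by_counting(H, d):
    """Number of n mod d with d | (n + h_1)...(n + h_k), counted directly."""
    return sum(1 for n in range(d) if math.prod(n + h for h in H) % d == 0)

H = [0, 2, 6]
for d in (2, 3, 5, 6, 10, 15, 30, 210):
    print(d, f_by_crt(H, d), f_by_counting(H, d))
```

For moduli divisible by the square of a prime the two counts can differ; for instance with the set {0, 2} and d = 4 the direct count is 2 while the product formula gives 1.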

The problem in Selberg's sieve is to choose $\lambda_d$ subject to the linear constraint (5) in
such a way as to minimize the quadratic form (7) (that would give the best upper bound
for $\sum_{x \le n \le 2x} a(n)$). This can be achieved using Lagrange multipliers, or by diagonalizing
the quadratic form (7). We do not give the details of this calculation but just record the
result obtained. The optimal choice of $\lambda_d$ for $d \le R$ is given by

$$\lambda_d \approx \mu(d) \left( \frac{\log(R/d)}{\log R} \right)^k,$$

where $\mu(d)$ is the Mobius function.⁹ With this choice of $\lambda_d$ the quantity in (7) is

$$\approx k!\, \mathfrak{S}(\mathcal{H}) \frac{x}{(\log R)^k} \approx 2^k k!\, \mathfrak{S}(\mathcal{H}) \frac{x}{(\log x)^k}.$$
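To make the construction concrete, here is a small sketch of the sieve weight a(n) from (6), assuming the weights $\lambda_d = \mu(d) (\log(R/d)/\log R)^k$ for $d \le R$ and $\lambda_d = 0$ otherwise. That explicit formula, the function names, and the parameters x = 1000, R = 30 and the set {0, 2} are all assumptions made for this illustration. The code checks the two properties used in the text: a(n) is non-negative, and a(n) = 1 when n + h_1, ..., n + h_k are all primes exceeding R, so that the sum of a(n) bounds the number of prime tuples from above.

```python
import math

def mobius(d):
    """Mobius function mu(d), computed by trial division."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0  # d is divisible by the square of a prime
            result = -result
        p += 1
    return -result if d > 1 else result

def is_prime(n):
    """Primality test by trial division."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def a(n, H, R):
    """Selberg-type weight: the square of the sum of lambda_d over d | (n+h_1)...(n+h_k), d <= R."""
    k = len(H)
    m = math.prod(n + h for h in H)
    s = sum(mobius(d) * (math.log(R / d) / math.log(R)) ** k
            for d in range(1, R + 1) if m % d == 0)
    return s * s

H, x, R = [0, 2], 1000, 30
total = sum(a(n, H, R) for n in range(x, 2 * x + 1))
twins = sum(1 for n in range(x, 2 * x + 1) if all(is_prime(n + h) for h in H))
print(f"sum of a(n) over [x, 2x]: {total:.2f}")
print(f"number of n in [x, 2x] with n and n + 2 both prime: {twins}")
```

With R this small the upper bound is far from sharp; the point is only that the inequality holds by construction.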

The appearance at this stage of the Mobius function is not surprising, as it is very intimately
connected with primes. For example, the reader can check that $\sum_{d \mid m} \mu(d) (\log(m/d))^k$
equals 0 unless m is divisible by at most k distinct prime factors. When $m = p_1 \cdots p_k$ is
the product of k distinct prime factors it equals $k! (\log p_1) \cdots (\log p_k)$, and there is a more
complicated formula if m is composed of fewer than k primes, or if m is divisible by powers
of primes. Applying this to $m = (n + h_1) \cdots (n + h_k)$, we are essentially picking out prime
k-tuples! The optimum in Selberg's sieve is a kind of approximation to this identity.
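The identity just mentioned can be checked numerically. Since $\mu(d)$ vanishes unless d is squarefree, the sum runs over products of subsets of the distinct primes dividing m, with sign $(-1)^r$ for a subset of size r. The sketch below evaluates the sum this way; the function names and the test values of m and k are assumptions for this illustration.

```python
import math
from itertools import combinations

def prime_factors(m):
    """Distinct prime factors of m, by trial division."""
    factors, p = [], 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors

def mobius_log_sum(m, k):
    """Sum over d | m of mu(d) * (log(m/d))^k, using only squarefree d."""
    primes = prime_factors(m)
    total = 0.0
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            d = math.prod(subset)  # squarefree divisor with mu(d) = (-1)^r
            total += (-1) ** r * math.log(m / d) ** k
    return total

print(mobius_log_sum(30, 2))                            # three distinct primes, k = 2
print(mobius_log_sum(6, 2), 2 * math.log(2) * math.log(3))
print(mobius_log_sum(30, 3), 6 * math.log(2) * math.log(3) * math.log(5))
```

Up to floating-point rounding, the first value should vanish and each of the remaining lines should show two equal numbers.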

Return to Goldston-Pintz-Yildirim. We want to find a non-negative function a(n)
so as to make (4) hold. Motivated by Selberg's sieve we may try to find optimal $\lambda_d$ as in
(5) and again choose a(n) as in (6). If we try such a choice, then our problem now is to
maximize the ratio

$$\Big( \sum_{\substack{x \le n \le 2x \\ n + h_j \text{ prime}}} a(n) \Big) \Big/ \Big( \sum_{x \le n \le 2x} a(n) \Big). \tag{8}$$

We'd like this ratio to be > 1/k. Notice again that it is advantageous to choose R as large
as possible to give greatest freedom in choosing $\lambda_d$, but in order to evaluate the sums above
there may be restrictions on the size of R. In dealing with the denominator we saw that
there is a restriction $R \le \sqrt{x}$ (essentially) and that in this situation the denominator in (8)
is given by the quadratic form (7). We will see below that in dealing with the numerator
of (8), a more stringent restriction on R must be made: we can only take R around size
$x^{1/4}$.

⁸ These are functions satisfying f(mn) = f(m)f(n) for any pair of coprime integers m and n.
⁹ $\mu(d) = 0$ if d is divisible by the square of a prime. Otherwise $\mu(d) = (-1)^{\omega(d)}$ where $\omega(d)$ is the
number of distinct primes dividing d.

In any case, (8) is the ratio of two quadratic forms, and this ratio needs to be maximized
keeping in mind the linear constraint (5). This optimization problem is more delicate
than the one in Selberg's sieve. It is not clear how to proceed most generally: Lagrange
multipliers become quite messy, and we can't quite diagonalize both quadratic forms
simultaneously. It helps to narrow the search to a special class of $\lambda_d$. Motivated by Selberg's
sieve we will search for the optimum among the choices (for $d \le R$)

$$\lambda_d = \mu(d)\, P\!\left( \frac{\log (R/d)}{\log R} \right).$$

Here P(y) denotes a polynomial such that P(1) = 1 and such that P vanishes to order at
least k at y = 0. The condition that P be a polynomial can be relaxed a bit but this is
not important. It is however vital for the analysis that P should vanish to order k at 0.
Our aim is to find a choice for P which makes the ratio in (8) large.

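With this choice of $\lambda_d$ we can use standard arguments to evaluate (7) and thus the
denominator in (8). Omitting the long, technical details, the answer is that for R a little
below $\sqrt{x}$, the denominator in (8) is

$$\sim \frac{x}{(\log R)^k}\, \mathfrak{S}(\mathcal{H}) \int_0^1 \frac{y^{k-1}}{(k-1)!} \left( P^{(k)}(1-y) \right)^2 dy, \tag{9}$$

where $P^{(k)}$ denotes the k-th derivative of the polynomial P.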

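To handle the numerator of (8), we expand out the square in (6) and sum over $x \le n \le 2x$
with $n + h_j$ being prime. Thus the numerator is

$$\sum_{d_1, d_2 \le R} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{x \le n \le 2x \\ [d_1, d_2] \mid (n+h_1) \cdots (n+h_k) \\ n+h_j \text{ prime}}} 1.$$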

How can we evaluate the inner sum over n? As we saw before, the condition that $[d_1, d_2]$ divides
$(n + h_1) \cdots (n + h_k)$ means that n lies in $f([d_1, d_2])$ arithmetic progressions (mod $[d_1, d_2]$).
For each of these progressions we must count the number of n such that $n + h_j$ is prime.
Of course, for some of the $f([d_1, d_2])$ progressions it may happen that $n + h_j$ automatically
has a common factor with $[d_1, d_2]$ and so cannot be prime. Suppose there are $g([d_1, d_2])$
progressions such that $n + h_j$ is guaranteed to be coprime to $[d_1, d_2]$. For each of these
progressions we are counting the number of primes between x and 2x lying in a reduced
residue class¹⁰ (mod $[d_1, d_2]$). Given a modulus q, the prime number theorem in arithmetic
progressions says that the primes are roughly equally divided among the reduced residue
classes (mod q). Thus, ignoring error terms completely, we expect the sum over n to be
about

$$\frac{\pi(2x) - \pi(x)}{\phi([d_1, d_2])}\, g([d_1, d_2]).$$



10 A reduced residue class (mod q) is a progression a (mod q) where a is coprime to q. 
The $\phi([d_1, d_2])$ in the denominator is Euler's $\phi$-function: for any integer m, $\phi(m)$ counts
the number of reduced residue classes (mod m). Since $\pi(2x) - \pi(x)$ is about x/log x we
'conclude' that the numerator in (8) is about

$$\frac{x}{\log x} \sum_{d_1, d_2 \le R} \frac{g([d_1, d_2])}{\phi([d_1, d_2])}\, \lambda_{d_1} \lambda_{d_2}. \tag{10}$$

This is the expression analogous to (7) for the numerator.

Two big questions: what is the function g, and for what range of R can we handle the
error terms above? Let us first describe g. As with f let us suppose that $[d_1, d_2]$ is not
divisible by the square of any prime. As noted earlier, if p is prime and $p \mid (n+h_1) \cdots (n+h_k)$
then n lies in one of $\nu_{\mathcal{H}}(p)$ residue classes (mod p). If we want $n + h_j$ to be prime, then
one of these residue classes, namely $n \equiv -h_j$ (mod p), must be forbidden. Thus there are
now $\nu_{\mathcal{H}}(p) - 1$ residue classes available for n (mod p). In other words, $g(p) = \nu_{\mathcal{H}}(p) - 1$,
and the Chinese remainder theorem shows that g must be defined multiplicatively:

$$g([d_1, d_2]) = \prod_{p \mid [d_1, d_2]} (\nu_{\mathcal{H}}(p) - 1).$$
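The counting behind f and g can be checked by brute force for a small squarefree modulus. The sketch below is illustrative only: the helper names (`nu`, `prime_divisors`, `count_classes`) are ours, and we take f(d) to be the product of $\nu_{\mathcal{H}}(p)$ over primes p dividing d, which is how the text describes it for squarefree d.

```python
from math import gcd, prod


def nu(H, p):
    """Number of distinct residue classes (mod p) occupied by the elements of H."""
    return len({h % p for h in H})


def prime_divisors(d):
    """Distinct primes dividing d, by trial division."""
    primes = []
    p = 2
    while p * p <= d:
        if d % p == 0:
            primes.append(p)
            while d % p == 0:
                d //= p
        p += 1
    if d > 1:
        primes.append(d)
    return primes


def f(H, d):
    """Product of nu(H, p) over primes p | d (d is assumed squarefree)."""
    return prod(nu(H, p) for p in prime_divisors(d))


def g(H, d):
    """Product of nu(H, p) - 1 over primes p | d (d is assumed squarefree)."""
    return prod(nu(H, p) - 1 for p in prime_divisors(d))


def count_classes(H, j, d):
    """Brute-force count over residues n (mod d).

    Returns the number of n with d | (n+h_1)...(n+h_k), and the number of
    those n for which n + H[j] is in addition coprime to d.
    """
    divisible = coprime = 0
    for n in range(d):
        if prod(n + h for h in H) % d == 0:
            divisible += 1
            if gcd(n + H[j], d) == 1:
                coprime += 1
    return divisible, coprime


H = (7, 11, 13, 17, 19, 23)
d = 3 * 5 * 7
print(f(H, d), g(H, d), count_classes(H, 0, d))
```

The two brute-force counts should agree with f(d) and g(d) for any choice of the index j, which is the content of the Chinese remainder theorem argument above.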

We will postpone the detailed discussion on primes in arithmetic progressions which is
needed to handle the error terms above. For the moment, let us note that the
Bombieri-Vinogradov theorem (which is a powerful substitute for the generalized Riemann hypothesis
in many applications) allows us to control $\pi(x; q, a)$ (the number of primes up to x which
are congruent to a (mod q)), on average over q, for q up to about $\sqrt{x}$. Since our moduli
are $[d_1, d_2]$, which go up to $R^2$, we see that R may be chosen up to about $x^{1/4}$. Conjectures
of Montgomery, and Elliott and Halberstam (discussed below) would permit larger values
of R, going up to $x^{1/2 - \epsilon}$ for any $\epsilon > 0$.

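Thus, with R a little below $x^{1/4}$, the expression (10) does give a good approximation to
the numerator of (8). Now a standard but technical argument can be used to evaluate
(10). As with (9), the answer is

$$\sim \frac{x}{(\log x)(\log R)^{k-1}}\, \mathfrak{S}(\mathcal{H}) \int_0^1 \frac{y^{k-2}}{(k-2)!} \left( P^{(k-1)}(1-y) \right)^2 dy. \tag{11}$$

Assuming that $\mathfrak{S}(\mathcal{H}) \neq 0$, it follows from (9) and (11) that the ratio in (8) is about

$$\frac{\log R}{\log x} \left( \int_0^1 \frac{y^{k-2}}{(k-2)!} \left( P^{(k-1)}(1-y) \right)^2 dy \right) \Bigg/ \left( \int_0^1 \frac{y^{k-1}}{(k-1)!} \left( P^{(k)}(1-y) \right)^2 dy \right). \tag{12}$$

This is the moment of truth: can we choose P so as to make this a little larger than 1/k?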

Here is a good choice for P: take $P(y) = y^{k+r}$ for a non-negative integer r to be chosen
optimally. After some calculations with beta-integrals, we see that (12) then equals

$$\left( \frac{\log R}{\log x} \right) \left( \frac{2(2r+1)}{(r+1)(k+2r+1)} \right).$$
This is largest when r is about $\sqrt{k}/2$, and the second fraction above is close to but less
than 4/k. Since we can choose R a little below $x^{1/4}$, the first fraction is close to but less
than 1/4. Thus (12) is very close to, but less than, 1/k. We therefore barely fail to prove
bounded gaps between primes! Of course, we just tried one choice of P; maybe there is
a better choice which gets us over the edge. Unfortunately, the second fraction in (12)
cannot be made larger than 4/k. If we set $Q(y) = P^{(k-1)}(y)$ then Q is a polynomial, not
identically zero, with Q(0) = 0; for such polynomials Q we claim that the unfortunate
inequality

$$\int_0^1 \frac{y^{k-2}}{(k-2)!}\, Q(1-y)^2\, dy < \frac{4}{k} \int_0^1 \frac{y^{k-1}}{(k-1)!}\, Q'(1-y)^2\, dy$$

holds. The reader can try her hand at proving this.
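The beta-integral computation for $P(y) = y^{k+r}$ can be checked in exact rational arithmetic. The following Python sketch (function names are ours) evaluates the two integrals appearing in (9) and (11) by expanding $(1-y)^b$ with the binomial theorem, compares their ratio with the closed form $2(2r+1)/((r+1)(k+2r+1))$, and looks for the best r for a few sample values of k.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a, b):
    """Exact integral of y**a * (1 - y)**b over [0, 1], via binomial expansion."""
    return sum(Fraction((-1) ** i * comb(b, i), a + i + 1) for i in range(b + 1))


def weighted_integral(k, r, m):
    """Integral over [0, 1] of y**(m-1)/(m-1)! * (P^(m)(1 - y))**2, P(y) = y**(k+r).

    Here P^(m)(y) = (k+r)!/(k+r-m)! * y**(k+r-m). Taking m = k gives the
    integral in (9), and m = k - 1 the integral in (11).
    """
    coeff = Fraction(factorial(k + r), factorial(k + r - m))
    return coeff ** 2 * beta_integral(m - 1, 2 * (k + r - m)) / factorial(m - 1)


def second_fraction(k, r):
    """Ratio of the integral in (11) to the integral in (9)."""
    return weighted_integral(k, r, k - 1) / weighted_integral(k, r, k)


def closed_form(k, r):
    """The closed form 2(2r+1) / ((r+1)(k+2r+1)) quoted in the text."""
    return Fraction(2 * (2 * r + 1), (r + 1) * (k + 2 * r + 1))


# Exact agreement between the integrals and the closed form for small k, r.
print(all(second_fraction(k, r) == closed_form(k, r)
          for k in range(2, 9) for r in range(6)))

# For each k, the best r and the value of k * (second fraction), which stays below 4.
for k in (7, 20, 100):
    best_r = max(range(0, 30), key=lambda r: closed_form(k, r))
    print(k, best_r, float(k * closed_form(k, best_r)))
```

This only tests the monomial choice of P; it says nothing about the general inequality for Q, which is left to the reader as in the text.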

We now have enough to prove Theorem 2! If we can choose R a little larger than $x^{1/4}$
then for suitably large k the quantity in (12) can be made larger than 1/k as desired. If we
allow $R = x^{1/2 - \epsilon}$ as the Elliott-Halberstam conjecture predicts, then with k = 7 and r = 1
we can make (12) nearly 1.05/k > 1/k. Thus, if we take any set $\mathcal{H}$ with seven elements
and $\mathfrak{S}(\mathcal{H}) \neq 0$ then for infinitely many n at least two of the numbers $n + h_1, \ldots, n + h_7$
are prime! By choosing a more careful polynomial P we can make do with six element sets
$\mathcal{H}$ rather than seven. The first six primes larger than 6 are 7, 11, 13, 17, 19, and 23, and
so $\mathfrak{S}(\{7, 11, 13, 17, 19, 23\}) \neq 0$. Thus, it follows that, assuming the Elliott-Halberstam
conjecture, there are infinitely many gaps between primes that are at most 16.
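The arithmetic in this paragraph can be reproduced in a few lines. The sketch below is an illustration under two assumptions that we state explicitly: it uses the limiting values log R / log x = 1/2 and 1/4 (the argument only allows exponents slightly below these), and it tests $\mathfrak{S}(\mathcal{H}) \neq 0$ through the standard criterion that $\mathcal{H}$ misses at least one residue class modulo every prime. The function names are ours.

```python
from fractions import Fraction


def gpy_ratio(k, r, theta):
    """Approximate value of (12) for P(y) = y**(k+r), with theta = log R / log x."""
    return theta * Fraction(2 * (2 * r + 1), (r + 1) * (k + 2 * r + 1))


def is_admissible(H):
    """True if, for every prime p <= len(H), H misses some residue class (mod p).

    Primes p > len(H) need no check: len(H) elements cannot cover p classes.
    """
    k = len(H)
    for p in range(2, k + 1):
        if all(p % q for q in range(2, p)):  # p is prime
            if len({h % p for h in H}) == p:
                return False
    return True


# Limiting case theta = 1/2 (Elliott-Halberstam allows any theta below 1/2):
# k times the value of (12), to be compared with 1.
print(7 * gpy_ratio(7, 1, Fraction(1, 2)))

# Limiting case theta = 1/4 (Bombieri-Vinogradov allows any theta below 1/4).
print(7 * gpy_ratio(7, 1, Fraction(1, 4)))

# The six-element set from the text, and the distance between its extremes.
H = (7, 11, 13, 17, 19, 23)
print(is_admissible(H), max(H) - min(H))
```

Exact fractions are used so that the comparison with 1 is not blurred by rounding.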

What can we recover unconditionally? We are so close to proving Theorem 2 unconditionally,
that clearly some tweaking of the argument must give Theorem 1! The idea here
is to average over sets $\mathcal{H}$. For clarity, let us now denote a(n) above by $a(n; \mathcal{H})$ to exhibit
the dependence on $\mathcal{H}$.

Given $\epsilon > 0$ we wish to find primes p between x and 2x such that $p_{\text{next}} - p \le \epsilon \log x$.
This would prove Theorem 1. Set $h = \epsilon \log x$, and let k be a natural number chosen in
terms of $\epsilon$, but fixed compared to x. Consider the following two sums:

$$\sum_{1 \le h_1 < h_2 < \ldots < h_k \le h}\ \sum_{x \le n \le 2x} a(n; \{h_1, \ldots, h_k\}), \tag{13}$$

and

$$\sum_{1 \le h_1 < h_2 < \ldots < h_k \le h}\ \sum_{1 \le \ell \le h}\ \sum_{\substack{x \le n \le 2x \\ n + \ell \text{ prime}}} a(n; \{h_1, \ldots, h_k\}). \tag{14}$$

If we could prove that (14) is larger than (13), it would follow that for some n between x
and 2x, there are two prime numbers between n + 1 and n + h, as desired.

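Our analysis above already gives us the asymptotics for (13) and (14). Using (9) we see
that the quantity (13) is

$$\sim \sum_{1 \le h_1 < h_2 < \ldots < h_k \le h} \frac{x}{(\log R)^k}\, \mathfrak{S}(\{h_1, \ldots, h_k\}) \int_0^1 \frac{y^{k-1}}{(k-1)!} \left( P^{(k)}(1-y) \right)^2 dy,$$

and using Gallagher's result (3) this is

$$\sim \frac{x}{(\log R)^k} \frac{h^k}{k!} \int_0^1 \frac{y^{k-1}}{(k-1)!} \left( P^{(k)}(1-y) \right)^2 dy. \tag{15}$$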

Now let us consider (14). Here we distinguish two cases: the case when $\ell = h_j$ for some
j, and the case when $\ell \neq h_j$ for all j. The former case is handled by our analysis leading
up to (11). Upon using (3) again, these terms contribute

$$\sim k\, \frac{x}{(\log x)(\log R)^{k-1}} \frac{h^k}{k!} \int_0^1 \frac{y^{k-2}}{(k-2)!} \left( P^{(k-1)}(1-y) \right)^2 dy. \tag{16}$$

If we choose $P(y) = y^{k+r}$ as before, we see that (16) is already just a shade below (15), so
we need the slightest bit of extra help from the terms $\ell \neq h_j$ for any j. If $n + \ell$ is prime
note that

$$a(n; \{h_1, \ldots, h_k\}) = \Bigg( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Bigg)^2 = \Bigg( \sum_{d \mid (n+h_1) \cdots (n+h_k)(n+\ell)} \lambda_d \Bigg)^2 = a(n; \{h_1, \ldots, h_k, \ell\}),$$

since the divisors counted in the latter sum but not the former are all multiples of the
prime $n + \ell$, hence at least $n + \ell > x > R$, and so $\lambda_d = 0$ for such divisors. This allows us
to finesse the calculation by simply appealing to (11) again, with k replaced by k + 1 and
$\{h_1, \ldots, h_k\}$ by $\{h_1, \ldots, h_k, \ell\}$. Thus the latter class of integers $\ell$ contributes

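$$\sim \sum_{1 \le h_1 < h_2 < \ldots < h_k \le h}\ \sum_{\substack{1 \le \ell \le h \\ \ell \neq h_j}} \frac{x}{(\log x)(\log R)^k}\, \mathfrak{S}(\{h_1, \ldots, h_k, \ell\}) \int_0^1 \frac{y^{k-1}}{(k-1)!} \left( P^{(k)}(1-y) \right)^2 dy.$$

Appealing to (3) again (we are now summing over k + 1 element sets but each set is
counted k + 1 times), this is

$$\sim \frac{x}{(\log x)(\log R)^k} \frac{h^{k+1}}{k!} \int_0^1 \frac{y^{k-1}}{(k-1)!} \left( P^{(k)}(1-y) \right)^2 dy. \tag{17}$$

This accounts for a factor of $\epsilon$ times the quantity in (15), and now the combined
contribution of (16) and (17) may be made larger than (15), proving Theorem 1!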
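To see how the extra factor of $\epsilon$ tips the balance, one can compare the three quantities for $P(y) = y^{k+r}$. Dividing (16) and (17) by (15) gives $k \cdot \frac{\log R}{\log x} \cdot \frac{2(2r+1)}{(r+1)(k+2r+1)}$ and $\epsilon$ respectively. The sketch below is only a bookkeeping aid under that simplification: `theta` stands for log R / log x, the sample value 249/1000 is an arbitrary choice just below 1/4, and the function names are ours.

```python
from fractions import Fraction
from math import isqrt


def second_fraction(k, r):
    """The closed form 2(2r+1) / ((r+1)(k+2r+1)) for P(y) = y**(k+r)."""
    return Fraction(2 * (2 * r + 1), (r + 1) * (k + 2 * r + 1))


def combined_over_15(k, r, theta, eps):
    """((16) + (17)) / (15), with theta = log R / log x and h = eps * log x."""
    return k * theta * second_fraction(k, r) + eps


def smallest_k(theta, eps, k_max=2000):
    """Smallest k <= k_max for which some r makes the combined ratio exceed 1.

    Only r up to about sqrt(k) is searched, since the best r is near sqrt(k)/2.
    Returns None if no such k is found.
    """
    for k in range(2, k_max + 1):
        if any(combined_over_15(k, r, theta, eps) > 1 for r in range(isqrt(k) + 2)):
            return k
    return None


theta = Fraction(249, 1000)  # illustrative value slightly below 1/4

# Without the extra terms (eps = 0) the ratio never reaches 1.
print(smallest_k(theta, Fraction(0), k_max=300))

# With eps = 1/5 a finite k suffices.
print(smallest_k(theta, Fraction(1, 5)))
```

Smaller values of $\epsilon$ require larger k, consistent with k being chosen in terms of $\epsilon$ in the text.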

Primes in arithmetic progressions. It remains to explain what is meant by the 
Bombieri-Vinogradov theorem and the Elliott-Halberstam conjecture. Recall that we re- 
quired knowledge of these estimates for primes in progressions while discussing the error 
terms that arise while evaluating the numerator of (8). 
Let us write 

$$\pi(x) = \mathrm{li}(x) + E(x),$$

where li(x) stands for the 'logarithmic integral' which is the expected main term,

$$\mathrm{li}(x) = \int_2^x \frac{dt}{\log t},$$

and E(x) stands for an 'error term'. The main term li(x) is, by integration by parts,
roughly x/log x. As for the error term E(x), the standard proofs of the Prime Number
Theorem give that for any number A > 0 there exists a constant C(A) such that

$$|E(x)| \le C(A) \frac{x}{(\log x)^A}.$$

The argument generalizes readily for primes in progressions. Given an arithmetic progres- 
sion a (mod q) with (a, q) = 1 let us write 

$$\pi(x; q, a) = \frac{1}{\phi(q)}\, \mathrm{li}(x) + E(x; q, a),$$

where $\mathrm{li}(x)/\phi(q)$ is the expected main term (the primes are equally divided among the
available residue classes) and E(x; q, a) is an 'error term' which we would like to be small.
As with the Prime Number Theorem, for every A > 0 there exists a constant C(q, A) such
that

$$|E(x; q, a)| \le C(q, A) \frac{x}{(\log x)^A}.$$

We emphasize that the constant C(q, A) may depend on q. Therefore, this result is
meaningful only if we think of q as being fixed and let x tend to $\infty$. In applications such a result
is not very useful, because we may require q not to be fixed, but to grow with x. For
example, in our discussions above we want to deal with primes in progressions (mod $[d_1, d_2]$)
which can be as large as $R^2$, and we'd like this to be of size $x^{1/2}$ and would love to have
it be even larger. Thus the key issue while discussing primes in arithmetic progressions is
the uniformity in q with which the asymptotic formula holds.

What is known about $\pi(x; q, a)$ for an individual modulus q is disturbingly weak. From
a result of Siegel we know that for any given positive numbers N and A, there exists a
constant c(N, A) such that if $q \le (\log x)^N$ then

$$|E(x; q, a)| \le c(N, A) \frac{x}{(\log x)^A}.$$

This is better than the result for fixed q mentioned earlier, but the range of q is still
very restrictive. An additional defect is that the constant c(N, A) cannot be computed
explicitly¹¹ in terms of N and A.

If we assume the Generalized Riemann Hypothesis (GRH) then we would fare much
better: if $x \ge q$ there exists a positive constant C independent of q such that

$$|E(x; q, a)| \le C x^{1/2} \log x.$$

This gives a good asymptotic formula for $\pi(x; q, a)$ in the range $q \le x^{1/2}/(\log x)^3$, say.

11 This is not due to laziness, but is a fundamental defect of the method of proof.

Given a modulus q let us define

$$E(x; q) = \max_{(a, q) = 1} |E(x; q, a)|.$$
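For small x the quantity E(x; q) can be computed directly, which gives some feeling for its size. The sketch below is an illustration only: it assumes the normalization $\mathrm{li}(x) = \int_2^x dt/\log t$, approximates that integral by Simpson's rule, and returns the per-class prime counts alongside the maximum error. The function names are ours.

```python
from math import gcd, log


def primes_up_to(x):
    """Sieve of Eratosthenes: list of primes <= x (for x >= 2)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]


def li(x, steps=100_000):
    """Approximate the integral of dt / log t over [2, x] by composite Simpson's rule."""
    if steps % 2:
        steps += 1
    width = (x - 2) / steps
    total = 1 / log(2) + 1 / log(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) / log(2 + i * width)
    return total * width / 3


def error_term(x, q):
    """Return (E(x; q), counts), where counts[a] = pi(x; q, a) for (a, q) = 1.

    E(x; q) is the maximum over reduced classes a (mod q) of
    |pi(x; q, a) - li(x) / phi(q)|.
    """
    counts = {a: 0 for a in range(q) if gcd(a, q) == 1}
    for p in primes_up_to(x):
        if p % q in counts:
            counts[p % q] += 1
    main_term = li(x) / len(counts)  # len(counts) equals phi(q)
    worst = max(abs(c - main_term) for c in counts.values())
    return worst, counts


worst, counts = error_term(1000, 4)
print(counts, round(worst, 3))
```

Such small experiments cannot, of course, say anything about uniformity in q, which is the real issue discussed here.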

We have discussed above the available weak bounds for E(x; q), and the unavailable strong
GRH bound. Luckily, in many applications including ours, we don't need a bound for
E(x; q) for each individual q, but only a bound holding in an average sense as q varies. In
the application to small gaps, we want primes in progressions (mod $[d_1, d_2]$), but recall
that we also have a sum over $d_1, d_2$ going up to R. An extremely powerful result of Bombieri
and Vinogradov gives such an average estimate for E(x; q). Moreover, this average result
is nearly as good as what would be implied by the GRH.

The Bombieri-Vinogradov theorem. For any positive constant A there exist constants
B and C such that

$$\sum_{q \le Q} \max_{y \le x} |E(y; q)| \le C \frac{x}{(\log x)^A} \tag{18}$$

with $Q = x^{1/2}/(\log x)^B$.

The constant B can be computed explicitly; for example B = 24A + 46 is permissible,
but the constant C here cannot be computed explicitly (a defect arising from Siegel's
theorem mentioned above). The Bombieri-Vinogradov theorem tells us that on average
over $q \le Q$ we have $E(x; q) \le C x (\log x)^{-A}/Q = C x^{1/2} (\log x)^{B-A}$. Apart from the power
of log x, this is as good as the GRH bound!

A straightforward application of the Bombieri-Vinogradov theorem shows that as long
as $R^2 \le x^{1/2}/(\log x)^B$ for suitably large B, the error terms arising in the
Goldston-Pintz-Yildirim argument will be manageable. If we wish to take R larger, then we must extend
the range of Q in (18). Such extensions are conjectured to hold, but unconditionally the
range in (18) has never been improved upon.¹²

The Elliott-Halberstam conjecture. Given $\epsilon > 0$ and A > 0 there exists a constant
C such that

$$\sum_{q \le Q} \max_{y \le x} |E(y; q)| \le C \frac{x}{(\log x)^A},$$

with $Q = x^{1 - \epsilon}$.

The Elliott-Halberstam conjecture would allow us to take $R = x^{1/2 - \epsilon}$ in the
Goldston-Pintz-Yildirim argument. It is worth emphasizing that knowing (18) for $Q = x^{\theta}$ with any
$\theta > 1/2$ would lead to the existence of bounded gaps between large primes.

Finally, let us mention a conjecture of Montgomery which lies deeper than the GRH 
and also implies the Elliott-Halberstam conjecture. 



12 Although Bombieri, Friedlander and Iwaniec [4] have made important progress in related problems.

Montgomery's conjecture. For any $\epsilon > 0$ there exists a constant $C(\epsilon)$ such that for all
$q \le x$ we have

$$E(x; q) \le C(\epsilon)\, x^{1/2 + \epsilon} q^{-1/2}.$$

We have given a very rapid account of prime number theory. For more detailed accounts 
we refer the reader to the books of Bombieri [2], Davenport [6], and Montgomery and 
Vaughan [24]. 

Future directions. We conclude the article by mentioning a few questions related to the 
work of Goldston-Pintz-Yildirim. 

First and most importantly, is it possible to prove unconditionally the existence of 
bounded gaps between primes? As it stands, the answer appears to be no, but perhaps 
suitable variants of the method will succeed. There are other sieve methods available 
beside Selberg's. Does modifying one of these (e.g. the combinatorial sieve) lead to a 
better result? If instead of primes we consider numbers with exactly two prime factors, 
then Goldston, Graham, Pintz, and Yildirim [13] have shown that there are infinitely many 
bounded gaps between such numbers. 

In a related vein, assuming the Elliott-Halberstam conjecture, can one get to twin 
primes? Recall that under that assumption, we could show that infinitely many permissible 
6-tuples contain two primes. Can the 6 here be reduced? Hopefully, to 2? Again the 
method in its present form cannot be pushed to yield twin primes, but maybe only one or 
two new ideas are needed. 

Given any $\epsilon > 0$, Theorem 1 shows that for infinitely many n the interval $[n, n + \epsilon \log n]$
contains at least two primes. Can we show that such intervals sometimes contain three
primes? Assuming the Elliott-Halberstam conjecture one can get three primes in such
intervals, see [12]. Can this be made unconditional? What about k primes in such intervals
for larger k?

Is there a version of this method which can be adapted to give long gaps between primes?
That is, can one attack Erdős's $10,000 question?